Abstract:
The 2011 tsunami on the eastern coast of Japan caused extensive damage to buildings. To date, several field surveys have been conducted that provide detailed information about the inundation areas and building damage characteristics along the affected east coast. In this study, building damage data for Ishinomaki city, with special attention to the affected plain coast area, are classified and analyzed using data surveyed by the Ministry of Land, Infrastructure, Transport and Tourism of Japan (MLIT) for more than 52,000 structures. The classification covers six levels of damage, four types of building materials, and the damage due to tsunami inundation for each building material, all of which are necessary for effective hazard mitigation. The percentage distribution of damage levels for each building material is plotted for different inundation depth ranges in several sets of figures. These plots not only show that reinforced concrete (RC) and steel buildings resisted damage better than wood or other buildings across all inundation depth ranges, but also clearly illustrate the inundation-induced damage behavior of each building material and the threshold depth for each damage level. This research also analyzes areas made vulnerable by coastal topography and geographical factors. Survey data provided by the Geospatial Information Authority of Japan (GSI), which classify the Ishinomaki plain coast area into three classes, are compared with a damage map produced using an Analytical Hierarchy Process (AHP) methodology in ArcGIS 10.2. The influence of key geographical features on tsunami-induced building damage, notably flooding along the Kitakami River and water canals, is taken into account in the weighting of factors. The good agreement between the produced building damage map and the surveyed GSI data shows the power of an AHP-based GIS tool for tsunami damage assessment. The results of this study are useful for understanding the damage behavior of buildings with different structural materials located in coastal areas vulnerable to tsunami disasters.
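The percentage distribution described in this abstract can be illustrated with a small sketch. It assumes each building is recorded as a (material, inundation depth in metres, damage level 1 to 6) tuple; the function names, depth bin edges and sample records are hypothetical and are not taken from the MLIT survey.

```python
from collections import Counter, defaultdict

def depth_bin(depth_m, edges):
    """Return the half-open bin (lo, hi) that contains depth_m, or None if outside."""
    for lo, hi in zip(edges, edges[1:]):
        if lo <= depth_m < hi:
            return (lo, hi)
    return None

def damage_distribution(records, edges):
    """Percentage of buildings at each damage level, per (material, depth bin)."""
    counts = defaultdict(Counter)
    for material, depth_m, level in records:
        bin_ = depth_bin(depth_m, edges)
        if bin_ is not None:
            counts[(material, bin_)][level] += 1
    result = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        result[key] = {lvl: 100.0 * n / total for lvl, n in sorted(counter.items())}
    return result

# Hypothetical records for illustration only (not survey data).
sample = [
    ("wood", 0.4, 2), ("wood", 0.8, 3), ("wood", 2.5, 6), ("wood", 2.9, 6),
    ("RC", 0.5, 1), ("RC", 2.2, 2), ("RC", 2.7, 3), ("steel", 2.4, 3),
]
dist = damage_distribution(sample, edges=[0.0, 1.0, 2.0, 3.0])
for key in sorted(dist):
    print(key, dist[key])
```

Reading the output bin by bin for one material gives a rough view of the depth at which each damage level starts to appear, which is the kind of threshold the abstract refers to.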
Abstract:
In a real environment, acoustic and language features often vary depending on the speakers, speaking styles and topic changes. To accommodate these changes, speech recognition approaches that include the incremental tracking of changing environments have attracted attention. This paper proposes a topic tracking language model that can adaptively track changes in topics based on current text information and previously estimated topic models in an on-line manner. The proposed model is applied to language model adaptation in speech recognition. We use the MIT OpenCourseWare corpus and Corpus of Spontaneous Japanese in speech recognition experiments, and show the effectiveness of the proposed method.
Abstract:
Because of media digitization, a large amount of information such as speech, audio and video data is produced every day. To retrieve data from these databases quickly and precisely, multimedia technologies for structuring and retrieving speech, audio and video data are strongly required. In this paper, we give an overview of the multimedia technologies existing today for TV news databases, such as structuring and retrieval of speech, audio and video data, speaker indexing, audio summarization and cross-media retrieval.
Abstract:
This study proposes a speech recognition method which is made robust to noise by combining speech signal estimation based on GMM and speech enhancement based on SVD in the temporal domain. Conventional speech signal estimation based on GMM has the problems that the time dependence of the noise is not considered and the performance is degraded in a low-SNR environment. As regards the first problem, successive updating of the mean noise vector is performed in this study to follow the time variation of the noise. As regards the second problem, an attempt is made to improve performance by improving the SNR beforehand by means of speech enhancement based on SVD in the time domain. Furthermore, in speech enhancement based on SVD in the time domain, the over-subtraction factor for the noise component is introduced in order to minimize the effect of noise, and adaptive determination of the factor is considered. The proposed method is evaluated using the AURORA2 database, and it is shown that the speech recognition accuracy is improved compared to conventional speech signal estimation based on GMM.
Abstract:
Our research focuses on classifiers that can process images rapidly and accurately without relying on a large-scale dataset, and presents a robust classification framework for both facial expression recognition (FER) and object recognition. The framework is based on support vector machines (SVMs) and employs three key approaches to enhance its robustness. First, it uses the perturbed subspace method (PSM) to extend the range of the sample space for training, which is an effective way to improve the robustness of a training system. Second, the framework adopts Speeded Up Robust Features (SURF) as features, which are better suited to real-time situations. Third, it introduces region attributes to evaluate and revise the SVM classification results, which improves the classifying ability of the SVMs. Combining these approaches, the proposed method makes the following contributions. First, the efficiency of SVMs is improved: experiments show that the proposed approach effectively reduces the number of samples, resulting in a clear reduction in training time. Second, the recognition accuracy is comparable to that of state-of-the-art algorithms. Third, the method is versatile and can be applied not only to object recognition but also to FER.
Keywords: PSM; SVMs; SURF; Region attributes; Object recognition; Facial expression recognition
Abstract:
This paper presents a voice conversion (VC) method that utilizes conditional restricted Boltzmann machines (CRBMs) for each speaker to obtain high-order speaker-independent spaces where voice features are converted more easily than those in an original acoustic feature space. The CRBM is expected to automatically discover common features lurking in time-series data. When we train two CRBMs for a source and target speaker independently using only speaker-dependent training data, it can be considered that each CRBM tries to construct subspaces where there are fewer phonemes and relatively more speaker individuality than the original acoustic space because the training data include various phonemes while keeping the speaker individuality unchanged. Each obtained high-order feature is then concatenated using a neural network (NN) from the source to the target. The entire network (the two CRBMs and the NN) can also be fine-tuned as a recurrent neural network (RNN) using the acoustic parallel data, since both the CRBMs and the concatenating NN have network-based representations with time dependencies. Through voice-conversion experiments, we confirmed the high performance of our method, especially in terms of objective evaluation, in comparison with conventional GMM, NN and RNN approaches and with our previous speaker-dependent DBN approach.
Keywords: Voice conversion; Conditional restricted Boltzmann machine; Deep learning; Recurrent neural network; Speaker-specific features
Abstract:
This paper proposes a speech signal estimation method based on the Kalman filter, as preprocessing for speech recognition in a noisy environment. Hitherto, the Kalman filter has been considered unsuited to real-time processing, since it requires a tremendous amount of computation. Consequently, the purpose of this paper is to reduce the amount of computation in the Kalman filter and to propose a speech signal estimation method for real-time processing, using high-speed operation. To evaluate the proposed method, a word recognition experiment was performed using a speech signal extracted from speech with superposed noise. The word recognition accuracy is compared with that of the conventional spectral subtraction method and the parallel model combination method, in order to demonstrate that the proposed method can deal automatically with various kinds of stationary noise without manual adjustment of the filter parameters for conditions such as the speaker, the kind of noise, and the SNR. For this purpose, the range of noise compensation by the proposed method is investigated. It is verified that the proposed method achieves a high word recognition rate, even in the presence of noise that degraded the recognition rate in the conventional method. In particular, the proposed method is effective in environments with a low SNR.
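For readers unfamiliar with the predict/update cycle that Kalman-filter signal estimation builds on, the following is a minimal scalar sketch. It is not the reduced-computation method proposed in the paper: it assumes a first-order autoregressive signal model x[t] = a*x[t-1] + w observed as y[t] = x[t] + v, and the function name `kalman_ar1` and all parameter values are illustrative assumptions.

```python
import random

def kalman_ar1(observations, a, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x[t] = a*x[t-1] + w, y[t] = x[t] + v.

    q is the process noise variance Var(w); r is the observation noise variance Var(v).
    """
    x, p = x0, p0
    estimates = []
    for y in observations:
        # Predict the next state and its error variance.
        x_pred = a * x
        p_pred = a * a * p + q
        # Update the prediction with the new observation.
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        estimates.append(x)
    return estimates

def mean_squared_error(xs, ys):
    return sum((u - v) ** 2 for u, v in zip(xs, ys)) / len(xs)

# Synthetic example (assumed parameters): a slowly varying signal in white noise.
rng = random.Random(0)
a, q, r = 0.95, 0.1, 1.0
clean, noisy, state = [], [], 0.0
for _ in range(2000):
    state = a * state + rng.gauss(0.0, q ** 0.5)
    clean.append(state)
    noisy.append(state + rng.gauss(0.0, r ** 0.5))

filtered = kalman_ar1(noisy, a, q, r)
mse_raw = mean_squared_error(noisy, clean)
mse_filtered = mean_squared_error(filtered, clean)
print(mse_raw, mse_filtered)
```

In this toy setting the filter knows the true model parameters; a practical speech front end would have to estimate them from the noisy signal, which is where most of the computational cost discussed in the abstract arises.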
Abstract:
Geographic Information Systems (GIS), remote sensing image processing and the analytical hierarchy process (AHP) were used to estimate and classify vulnerability and inundation areas for the 2011 Tohoku tsunami in Ishinomaki, Miyagi Prefecture, Japan. Suitable data were obtained from a GeoEye-1 satellite image, the GSI DEM and a field survey. Five factors (elevation, slope, shoreline distance, river distance and vegetation) were used to classify vulnerability and were weighted via AHP. By comparing the classified vulnerability map with the inundation map of the study area, we found that an area of 13.44 km² fell within the tsunami vulnerability zone. Inundation areas were located in the high and slightly high vulnerability classes. The Kitakami River and the Unga water canal acted as flooding strips, carrying tsunami waves into the hinterland. This research is important for understanding the roles of the main topographical factors in a tsunami disaster.
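As an illustration of how AHP can turn pairwise judgments about the five factors into weights, here is a minimal sketch. The pairwise values below are hypothetical placeholders, not the judgments used in the study, and the row geometric mean is used as a common approximation of the principal eigenvector.

```python
import math

FACTORS = ["elevation", "slope", "shoreline_distance", "river_distance", "vegetation"]

# Saaty's random consistency index for matrix sizes 1..9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def build_reciprocal_matrix(upper, n):
    """Build a full pairwise matrix from upper-triangle judgments {(i, j): value}."""
    matrix = [[1.0] * n for _ in range(n)]
    for (i, j), value in upper.items():
        matrix[i][j] = float(value)
        matrix[j][i] = 1.0 / value
    return matrix

def ahp_weights(matrix):
    """Approximate the priority vector by normalized row geometric means."""
    geo_means = [math.prod(row) ** (1.0 / len(row)) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI, with lambda_max estimated from A*w."""
    n = len(matrix)
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]

# Hypothetical judgments: entry (i, j) says how much more important factor i is than j.
judgments = {
    (0, 1): 2, (0, 2): 1, (0, 3): 3, (0, 4): 5,
    (1, 2): 1 / 2, (1, 3): 2, (1, 4): 3,
    (2, 3): 3, (2, 4): 5,
    (3, 4): 2,
}
matrix = build_reciprocal_matrix(judgments, len(FACTORS))
weights = ahp_weights(matrix)
cr = consistency_ratio(matrix, weights)
for name, w in zip(FACTORS, weights):
    print(f"{name}: {w:.3f}")
print(f"consistency ratio: {cr:.4f}")
```

A consistency ratio below 0.1 is the conventional threshold for accepting a set of judgments. In a GIS workflow the resulting weights would typically be applied to reclassified raster layers in a weighted overlay.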
Abstract:
This paper presents a talker localization method using only a single microphone, where phoneme hidden Markov models (HMMs) of clean speech are introduced to estimate the acoustic transfer function from the user's position. In our previous work, we proposed a Gaussian mixture model (GMM) separation for estimation of the user's position, where the observed speech is separated into the acoustic transfer function and the clean speech GMM. In this paper, we propose an improved method using phoneme HMMs for separation of the acoustic transfer function. This method expresses the speech signal as a network of phoneme HMMs, while our previous method expresses it as a GMM without considering the temporal phonetic changes of the speech signal. The support vector machine (SVM) for classifying the user's position is trained using the separated frame sequences of the acoustic transfer function. Then, for each test data set, the acoustic transfer function is separated, and the position is estimated by discriminating the acoustic transfer function. The effectiveness of this method has been confirmed by talker localization experiments performed in a room environment.